perform quality filtering and chimera removal. You may not need to perform any quality
control prior to using either of these two methods. For Deblur denoising only, you may need to
merge paired-end reads, as we did for the clustering.
7.3.4.2.2.1 Denoising with DADA2
The “q2-dada2” plugin is used to denoise single-end and paired-end reads (no need for
read merging). Use the “denoise-single” method for single-end reads and the “denoise-paired”
method for paired-end reads. You can always use “--help” with either method to
display its usage and options. The DADA2 methods denoise sequences, dereplicate them, and
filter chimeras.
qiime dada2 denoise-single --help
qiime dada2 denoise-paired --help
Now, we can use “q2-dada2” to denoise the yoga data that we imported and saved as
“demux-yoga.qza”. We will use the “denoise-paired” method. To keep the files organized, we
will create a “dada2” subdirectory for the DADA2 denoising files.
mkdir dada2
qiime dada2 denoise-paired \
--i-demultiplexed-seqs inputs/demux-yoga.qza \
--p-trim-left-f 0 \
--p-trim-left-r 0 \
--p-trunc-len-f 250 \
--p-trunc-len-r 250 \
--p-n-threads 4 \
--o-representative-sequences dada2/rep-seqs_yoga_dada2.qza \
--o-table dada2/table_yoga_dada2.qza \
--o-denoising-stats dada2/stats_yoga_dada2.qza
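When choosing truncation lengths for paired-end reads such as those in the command above, keep in mind that the truncated forward and reverse reads must still overlap so that DADA2 can merge them. The following is a minimal sketch of that arithmetic. The amplicon length and the minimum overlap are assumed placeholder values, not properties of the yoga data; the minimum overlap actually required depends on the DADA2 version in use.

```shell
#!/bin/sh
# Hypothetical values for illustration only; replace them with your own.
AMPLICON_LEN=460   # assumed amplicon length in bases (not taken from the yoga data)
TRUNC_F=250        # value passed to --p-trunc-len-f
TRUNC_R=250        # value passed to --p-trunc-len-r
MIN_OVERLAP=12     # assumed minimum overlap; check the value for your DADA2 version

# Bases shared by the truncated forward and reverse reads.
OVERLAP=$(( TRUNC_F + TRUNC_R - AMPLICON_LEN ))

if [ "$OVERLAP" -ge "$MIN_OVERLAP" ]; then
  echo "overlap=${OVERLAP}: enough overlap to merge the read pairs"
else
  echo "overlap=${OVERLAP}: too short, consider longer truncation lengths"
fi
```

With these placeholder values the overlap is 250 + 250 - 460 = 40 bases. If the result falls below the minimum, most read pairs will fail to merge and will be lost from the feature table.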
The parameters “--p-trim-left-f” and “--p-trim-left-r” set the number of bases to trim
from the 5′ end of the forward and reverse reads, respectively; they are optional and
default to 0. The parameters “--p-trunc-len-f” and “--p-trunc-len-r” set the position at
which the forward and reverse reads, respectively, are truncated at the 3′ end; they must
be specified, and reads shorter than the truncation length are discarded. Trimming and
truncation are used to improve the quality of the reads if required. If you use
“--p-trunc-len-f 0” and “--p-trunc-len-r 0”, truncation is disabled. The parameter
“--p-n-threads” specifies the number of threads used for denoising. If the “denoise-single”
method is used with paired-end reads instead of “denoise-paired”, only the forward reads
are used as input and the reverse reads are ignored.
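As a hedged illustration of the last two points, the sketch below builds a “denoise-single” command for the same paired-end artifact with truncation disabled, so that only the forward reads would be denoised. The output directory and file names are hypothetical, and by default the script only prints the command (a dry run); it is a sketch, not the procedure used in this chapter.

```shell
#!/bin/sh
# Hypothetical output directory, kept separate from the paired-end results.
OUT_DIR="dada2_single"

# Build the command as positional parameters so it can be printed or executed.
# --p-trunc-len 0 disables truncation; --p-trim-left 0 leaves the 5' end untouched.
set -- qiime dada2 denoise-single \
  --i-demultiplexed-seqs inputs/demux-yoga.qza \
  --p-trim-left 0 \
  --p-trunc-len 0 \
  --p-n-threads 4 \
  --o-representative-sequences "${OUT_DIR}/rep-seqs_yoga_dada2_single.qza" \
  --o-table "${OUT_DIR}/table_yoga_dada2_single.qza" \
  --o-denoising-stats "${OUT_DIR}/stats_yoga_dada2_single.qza"

# Dry run: show the command that would be executed.
DRY_RUN_CMD="$*"
echo "$DRY_RUN_CMD"

# Execute only when explicitly requested with RUN=1.
if [ "${RUN:-0}" = "1" ]; then
  mkdir -p "$OUT_DIR"
  "$@"
fi
```

Setting RUN=1 in the environment runs the command instead of only printing it. Because the reverse reads are ignored, the resulting features are based on the forward reads alone.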
We set “--p-trunc-len-f 250” and “--p-trunc-len-r 250” to truncate the forward and
reverse reads to 250 bases. We did not trim the left ends of the reads because they
do not need trimming. As with clustering, the DADA2 feature table and representative sequences
artifacts are used in downstream analyses such as phylogeny, diversity analysis, and taxonomy
assignment. Moreover, the DADA2 stats summary artifact contains useful information
about the filtering and denoising steps. Users can use “q2-metadata tabulate” to generate a
visualization file that can be displayed in a web browser with “tools view” as follows: